Combining Instance-Based Learning and Logistic Regression for Multilabel Classification (Resubmission)∗

نویسندگان

  • Weiwei Cheng
  • Eyke Hüllermeier
چکیده

Multilabel classification is an extension of conventional classification in which a single instance can be associated with multiple labels. Recent research has shown that, just like for standard classification, instance-based learning algorithms relying on the nearest neighbor estimation principle can be used quite successfully in this context. However, since hitherto existing algorithms do not take correlations and interdependencies between labels into account, their potential has not yet been fully exploited. In this paper, we propose a new approach to multilabel classification, which is based on a framework that unifies instance-based learning and logistic regression, comprising both methods as special cases. This approach allows one to capture interdependencies between labels and, moreover, to combine model-based and similarity-based inference for multilabel classification. As will be shown by experimental studies, our approach is able to improve predictive accuracy in terms of several evaluation criteria for multilabel prediction.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

MLSLR: Multilabel Learning via Sparse Logistic Regression

Multilabel learning, an emerging topic in machine learning, has received increasing attention in recent years. However, how to effectively tackle high-dimensional multilabel data, which are ubiquitous in real-world applications, is still an open issue in multilabel learning. Although many efforts have been made in variable selection for traditional data, little work concerns variable selection ...

متن کامل

MLSMOTE: Approaching imbalanced multilabel learning through synthetic instance generation

Learning from imbalanced data is a problem which arises in many real-world scenarios, so does the need to build classifiers able to predict more than one class label simultaneously (multilabel classification). Dealing with imbalance by means of resampling methods is an approach that has been deeply studied lately, primarily in the context of traditional (non-multilabel) classification. In this ...

متن کامل

A probabilistic methodology for multilabel classification

Multilabel classification is a relatively recent subfield of machine learning. Unlike to the classical approach, where instances are labeled with only one category, in multilabel classification, an arbitrary number of categories is chosen to label an instance. Due to the problem complexity (the solution is one among an exponential number of alternatives), a very common solution (the binary meth...

متن کامل

IRDDS: Instance reduction based on Distance-based decision surface

In instance-based learning, a training set is given to a classifier for classifying new instances. In practice, not all information in the training set is useful for classifiers. Therefore, it is convenient to discard irrelevant instances from the training set. This process is known as instance reduction, which is an important task for classifiers since through this process the time for classif...

متن کامل

Efficient Feature Selection and Multiclass Classification with Integrated Instance and Model Based Learning

Multiclass classification and feature (variable) selections are commonly encountered in many biological and medical applications. However, extending binary classification approaches to multiclass problems is not trivial. Instance-based methods such as the K nearest neighbor (KNN) can naturally extend to multiclass problems and usually perform well with unbalanced data, but suffer from the curse...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009